Evanescence-based multiplex sequencing method

ABSTRACT

The invention relates to a method and a device for evanescence-based multiplex sequencing of nucleic acid molecules immobilized on a support.

The invention concerns a method and a device for evanescence-based multiplex sequencing of nucleic acid molecules immobilized on a support.

The sequencing of the human genome, consisting of about 3×10⁹ bases, or of the genome of other organisms, as well as the determination and comparison of individual sequence variants, requires the provision of sequencing methods that are rapid and can also be used routinely and inexpensively. Although major attempts have been made to accelerate conventional sequencing methods such as the enzymatic chain termination method of Sanger et al. (Proc. Natl. Acad. Sci. USA 74 (1977) 5463), especially by automation (Adams et al. Automated DNA Sequencing and Analysis (1994), New York, Academic Press), at present no more than 2000 bases per day can be determined with a sequencer.

New approaches for overcoming the limitations of conventional sequencing methods have been developed in the last few years which include sequencing by scanning-tunnel microscopy (Lindsay and Phillip, Gen. Anal. Tech. Appl. 8 (1991), 8-13), by highly parallelized capillary electrophoresis (Huang et al., Anal. Chem. 64 (1992), 2149-2154; Kambara and Takahashi, Nature 361 (1993), 565-566), by oligonucleotide hybridization (Drmanac et al., Genomics 4 (1989), 114-128; Khrapko et al., FEBS Let. 256 (1989), 118-122; Maskos and Southern, Nucleic Acids Res. 20 (1992), 1675-1678 and 1679-1684) and by matrix-assisted laser desorption/ionization mass spectroscopy (Hillenkamp et al., Anal. Chem. 63 (1991), 1193A-1203A).

Another method is single molecule sequencing (Dorre et al., Bioimaging 5 (1997), 139-152) in which nucleic acids are sequenced by progressive enzymatic degradation of fluorescent-labelled single-stranded DNA molecules and detection of the sequentially released monomer molecules in a microstructure channel. The advantage of this method is that a single molecule of the target nucleic acid is sufficient to carry out a sequence determination.

Although considerable advances have been made by using the above-mentioned methods, there is a major need for further improvements. Hence the object of the present invention was to provide a method for sequencing nucleic acids which represents a further improvement over the prior art and which allows a parallel determination of individual nucleic acid molecules in a multiplex format.

A multiplex sequencing method is proposed in PCT/EP01/07462 in which nucleic acid molecules that carry several fluorescent marker groups are provided in an immobilized form on a support and the base sequence of several nucleic acid molecules is determined simultaneously on the basis of the time-dependent change in the fluorescence of the nucleic acid molecules or/and of the cleaved nucleotide building blocks caused by the cleavage of nucleotide building blocks.

The subject matter of the present application is a method for sequencing nucleic acids comprising the steps:

-   (a) providing an at least partially optically transparent support with a multitude of nucleic acid molecules immobilized thereon where the nucleic acid molecules carry several fluorescent marker groups,
-   (b) progressive cleavage of individual nucleotide building blocks from the immobilized nucleic acid molecules and
-   (c) simultaneous determination of the base sequence of a plurality of nucleic acid molecules based on the time-dependent change in the fluorescence of the nucleic acid molecules or/and of the cleaved nucleotide building blocks caused by the cleavage of nucleotide building blocks, wherein the fluorescence is produced by beaming light into the support and generating an evanescent excitation field by internal reflection at the support surface in the area of the immobilized nucleic acid molecules.

The method according to the invention is a support-based multiplex sequencing method in which a multitude of immobilized nucleic acid molecules are examined simultaneously. The support used for the method can be any desired planar or structured support that is suitable for immobilizing nucleic acid molecules and has, at least in the area of the immobilized nucleic acids, sufficient optical transparency and suitable surface properties for an evanescence-based detection of fluorescence. Examples of suitable support materials are glass, quartz, plastic or composite materials containing these materials. In principle the support can be designed in any manner, provided a reaction space can be formed which allows the progressive cleavage of individual nucleotide building blocks from nucleic acids immobilized on the support in a liquid reaction mixture.

The nucleic acid molecules, which can be in a single-stranded form or in a double-stranded form, are preferably immobilized on the support via their 5′ or 3′ ends. In the case of double-stranded molecules it must be ensured that labelled nucleotide building blocks can only be cleaved from a single strand. The nucleic acid molecules can be bound to the support by means of covalent or non-covalent interactions. For example, the binding of polynucleotides to the support can be mediated by high affinity interactions between the partners of a specific binding pair, e.g. biotin/streptavidin or avidin, hapten/anti-hapten-antibody, sugar/lectin etc. Thus biotinylated nucleic acid molecules can be coupled to streptavidin-coated supports. Alternatively the nucleic acid molecules can also be adsorptively bound to the support. Hence nucleic acid molecules modified by incorporation of alkanethiol groups can be bound to metallic supports e.g. gold supports. Another alternative is covalent immobilization in which the binding of the polynucleotides can be mediated by reactive silane groups on a silica surface.

A plurality of nucleic acid molecules that are to be sequenced are bound to a support. Preferably at least 100, particularly preferably at least 1,000 and especially preferably at least 10,000 and up to more than 10⁶ nucleic acid molecules are bound to the support. The bound nucleic acid fragments have a length of preferably 200 to 2,000 nucleotides, particularly preferably 400 to 1,000 nucleotides. The nucleic acid molecules bound to the support, e.g. DNA molecules or RNA molecules, contain a plurality of fluorescent marker groups and preferably at least 50%, particularly preferably at least 70% and most preferably essentially all, e.g. at least 90%, of the nucleotide building blocks of one base type carry a fluorescent marker group. Nucleic acids labelled in this manner can be produced by enzymatic primer extension on a nucleic acid template using a suitable polymerase e.g. a DNA polymerase such as Taq polymerase, a thermostable DNA polymerase from Thermococcus gorgonarius or other thermostable organisms (Hopfner et al., PNAS USA 96 (1999), 3600-3605) or a mutated Taq polymerase (Patel and Loeb, PNAS USA 97 (2000), 5095-5100) using fluorescent-labelled nucleotide building blocks.

The labelled nucleic acid molecules can also be produced by amplification reactions e.g. PCR. Thus in an asymmetric PCR, amplification products are formed in which only a single strand contains fluorescent labels. Such asymmetric amplification products can be sequenced in a double-stranded form. Nucleic acid fragments are formed by symmetrical PCR in which both strands are fluorescent labelled. These two fluorescent labelled strands can be separated and immobilized separately in a single-stranded form so that the sequence of one or both complementary strands can be determined separately. Alternatively one of the two strands can be modified at the 3′ end, for example by incorporation of a PNA clip, such that monomer building blocks can no longer be cleaved off. In this case double-strand sequencing is possible.

Preferably essentially all nucleotide building blocks of at least two base types, for example two, three or four base types, carry a fluorescent label and each base type advantageously carries a different fluorescent marker group. If the nucleic acid molecules are not completely labelled, the sequence can nevertheless be completely determined by sequencing a plurality of molecules in parallel.
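
The statement that incompletely labelled molecules can still yield a complete sequence when many are read in parallel can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each molecule has already been converted into a read of equal length with a common start position, and that "?" marks positions where no label was detected. The function name merge_partial_reads and this read format are hypothetical and are not part of the method described here.

```python
from collections import Counter


def merge_partial_reads(reads, unknown="?"):
    """Combine equally long partial reads into one consensus string.

    Each read uses `unknown` where a base carried no label. For every
    position the most frequent known base is taken; positions that no
    read covers stay `unknown`.
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")

    consensus = []
    for column in zip(*reads):
        known = [base for base in column if base != unknown]
        if known:
            # most_common(1) returns [(base, count)] for the majority base
            consensus.append(Counter(known).most_common(1)[0][0])
        else:
            consensus.append(unknown)
    return "".join(consensus)


# Hypothetical reads of the same template, each labelled incompletely.
reads = ["A?G?T", "?C?AT", "AC??T"]
print(merge_partial_reads(reads))  # prints "ACGAT"
```

Each of the three example reads leaves gaps, yet every position is covered by at least one read, so the merged result contains no unknown positions.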

The nucleic acid template whose sequence is to be determined can for example be selected from DNA templates such as genomic DNA fragments, cDNA molecules, plasmids etc. and also from RNA templates such as mRNA molecules.

The fluorescent marker groups can be selected from known fluorescent marker groups used to label biopolymers e.g. nucleic acids, such as fluorescein, rhodamine, phycoerythrin, Cy3, Cy5 or derivatives thereof etc.

The method according to the invention is based on the fact that fluorescent marker groups incorporated into nucleic acid strands interact with neighbouring groups, for example with chemical groups of the nucleic acids and in particular with nucleobases such as G, or/and with adjacent fluorescent marker groups, which results in a change in the fluorescence and in particular in the fluorescence intensity compared to that of the fluorescent marker groups in an isolated form due to quenching or/and energy transfer processes. Cleavage of individual nucleotide building blocks results in a change in the total fluorescence e.g. the fluorescence intensity of an immobilized nucleic acid strand is changed in a manner depending on the cleavage of individual nucleotide building blocks i.e. as a function of time. This change in the fluorescence over time can be detected in parallel for a plurality of nucleic acid molecules and can be correlated with the base sequence of individual nucleic acid strands. Fluorescent marker groups are preferably used which are at least partially quenched when they are incorporated into the nucleic acid strand such that after cleavage of the nucleotide building block containing the marker group or of a neighbouring building block which causes the quenching, the fluorescence intensity is increased.
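
A minimal Python sketch of how such a time-dependent fluorescence change might be turned into discrete events for one immobilized strand is given here. The sampled intensity trace, the fixed threshold and the function name detect_intensity_steps are assumptions made for illustration only; real traces would need noise filtering and calibration that the text does not specify.

```python
def detect_intensity_steps(trace, threshold):
    """Return (sample_index, delta) for every jump larger than threshold.

    A positive delta corresponds to an intensity increase (for example a
    quenched label being released), a negative delta to a decrease.
    """
    events = []
    for i in range(1, len(trace)):
        delta = trace[i] - trace[i - 1]
        if abs(delta) > threshold:
            events.append((i, delta))
    return events


# Hypothetical trace in arbitrary units, sampled at a fixed frame rate.
trace = [10.0, 10.2, 14.1, 14.0, 13.9, 18.2, 18.1, 15.0]
for index, delta in detect_intensity_steps(trace, threshold=2.0):
    print(f"frame {index}: change of {delta:+.1f}")
```

In this sketch the frames at which a step occurs stand in for cleavage events; mapping the order and sign of the steps to bases would additionally depend on which base types carry which marker groups.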

The sequencing reaction of the method according to the invention comprises the progressive cleavage of individual nucleotide building blocks from the immobilized nucleic acid molecules. An enzymatic cleavage is preferably carried out using an exonuclease in which single strand or double strand exonucleases that degrade in the 5′→3′ direction or 3′→5′ direction can be used depending on the manner in which the nucleic acid strands are immobilized on the support. T7 DNA polymerase, E. coli exonuclease I or E. coli exonuclease III are particularly preferably used as exonucleases.

A change in the fluorescence intensity of the immobilized nucleic acid strand or/and of the cleaved nucleotide building blocks due to quenching or energy transfer processes can be measured during the progressive cleavage of individual nucleotide building blocks. This change in the fluorescence intensity over time depends on the base sequence of the examined nucleic acid strand and can therefore be correlated with the sequence. In order to completely determine the sequence of a nucleic acid strand, several nucleic acid strands labelled on different bases e.g. A, G, C and T or combinations of two different bases are preferably generated by enzymatic primer extension as described above and immobilized on the support, where the immobilization can be at random sites on the support or can be carried out in a site-specific manner. A sequence identifier may optionally also be attached to the nucleic acid strand to be examined e.g. a labelled nucleic acid of a known sequence, for example by means of an enzymatic reaction using ligase or/and terminal transferase, such that firstly a known fluorescence pattern is obtained at the start of the sequencing and the fluorescence pattern of the unknown sequence to be examined is only obtained afterwards. Preferably a total of 10³ to 10⁶ nucleic acid strands are immobilized on a support.

In order to accelerate the removal of cleaved nucleotide building blocks from the immobilized nucleotide strands, a convection flow is preferably generated in the reaction space away from the support. The flow rate can be in the range of 1 to 10 mm/s.

The detection comprises beaming light into the support preferably by means of a laser. One or several laser beams can be used for this e.g. a widened laser beam with a cross-section of ca. 1-20 mm or/and multiple laser beams. An evanescent excitation field is generated by internal reflection at one or more positions of the support surface in the area of immobilized nucleic acid molecules which excites the fluorescent marker groups on the nucleic acid molecules immobilized on the support. The reflection on the support surface is preferably a total internal reflection.
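
The geometry behind the evanescent excitation field can be made concrete with the standard relations for total internal reflection: the critical angle θc = arcsin(n2/n1) and the 1/e penetration depth of the field intensity d = λ / (4π·sqrt(n1²·sin²θ − n2²)), where n1 is the refractive index of the support and n2 that of the adjoining medium. The Python sketch below evaluates both. The refractive indices (1.52 for a glass support, 1.33 for an aqueous reaction mixture), the 488 nm wavelength and the 70° incidence angle are illustrative assumptions and are not values taken from this document.

```python
import math


def critical_angle_deg(n_support, n_medium):
    """Critical angle for total internal reflection, in degrees."""
    if n_medium >= n_support:
        raise ValueError("total internal reflection needs n_support > n_medium")
    return math.degrees(math.asin(n_medium / n_support))


def penetration_depth_nm(wavelength_nm, angle_deg, n_support, n_medium):
    """1/e intensity penetration depth of the evanescent field, in nm."""
    s = (n_support * math.sin(math.radians(angle_deg))) ** 2 - n_medium ** 2
    if s <= 0:
        raise ValueError("angle is not above the critical angle")
    return wavelength_nm / (4 * math.pi * math.sqrt(s))


# Assumed example values (not from the source text).
N_GLASS, N_WATER = 1.52, 1.33
print(f"critical angle: {critical_angle_deg(N_GLASS, N_WATER):.1f} deg")
print(f"penetration depth at 70 deg, 488 nm: "
      f"{penetration_depth_nm(488, 70, N_GLASS, N_WATER):.0f} nm")
```

Under these assumed values the field decays within some tens of nanometres of the support surface, so mainly marker groups close to the surface are excited.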

The fluorescence emission of a plurality of nucleic acid strands generated by evanescent excitation can be detected in parallel using a detector matrix which for example comprises an electronic detection matrix e.g. a CCD camera or an avalanche photodiode matrix. Detection can be such that fluorescence excitation and detection occur concurrently on all examined nucleic acid strands. Alternatively the nucleic acid strands can be examined portion by portion in several steps. It is preferable to detect the fluorescence light that is emitted essentially orthogonally from the support surface.

Yet another subject matter of the invention is a device for sequencing nucleic acids comprising

-   (a) an at least partially optically transparent support comprising a multitude of nucleic acid molecules immobilized thereon where the nucleic acid molecules are present in a single-stranded form and carry several fluorescent marker groups,
-   (b) a reaction space for the progressive cleavage of individual nucleotide building blocks from the immobilized nucleic acid molecules,
-   (c) means for exciting fluorescence by beaming light into the support and generating an evanescent excitation field by internal reflection at the support surface in the area of the immobilized nucleic acid molecules and
-   (d) means for simultaneously determining the base sequence of a plurality of nucleic acid molecules based on the time-dependent change in fluorescence of the nucleic acid molecules or/and of the cleaved nucleotide building blocks caused by cleavage of nucleotide building blocks.

The method according to the invention and the device according to the invention can be used for example to analyse genomes and transcriptomes or for differential analyses e.g. investigations of differences in the genome or transcriptome of individual species or organisms within a species.

The present invention is further elucidated by the following figures.

FIG. 1 shows a schematic representation of an optically transparent support (2) according to the invention with a multitude of single-stranded labelled nucleic acid molecules (4) immobilized thereon. A support with an area of 1 to 2 cm² can for example contain up to 10⁶ nucleic acid strands.
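
As a rough plausibility check of the stated density, the mean distance between neighbouring strands can be estimated in a few lines of Python. The sketch assumes, for illustration only, that the strands sit on a uniform square grid; random immobilization would give a distribution of distances around this value. The function name mean_spacing_um is hypothetical.

```python
import math


def mean_spacing_um(area_cm2, n_strands):
    """Mean centre-to-centre spacing in micrometres on a square grid."""
    area_um2 = area_cm2 * 1e8  # 1 cm^2 = 10^8 um^2
    return math.sqrt(area_um2 / n_strands)


for area in (1.0, 2.0):
    print(f"{area} cm^2, 1e6 strands: {mean_spacing_um(area, 1e6):.1f} um")
```

For 10⁶ strands on 1 cm² this gives a spacing of about 10 µm, and about 14 µm on 2 cm².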

FIG. 2 shows a first embodiment of the invention in which excitation light (6) is beamed into the optically transparent support (2) with nucleic acid molecules (4) immobilized thereon by a widened laser and the light emerges again from the support (2) after reflection at the glass surface in the area of the immobilized nucleic acid molecules (4). The immobilized nucleic acid molecules (4) are excited to fluoresce by the evanescent excitation field. The emission light (8) is guided by an optical system (10) onto a detector (12).

In the embodiment shown in FIG. 3A evanescent excitation fields are generated by multiple reflections (14 a, 14 b, 14 c) in the optically transparent support (2). The evanescent excitation fields can for example be present in the form of strips (FIG. 3B) or points (FIG. 3C).

Alternatively it is also possible to beam several foci of the laser light into the support by using a diffractive optical system such as that disclosed in DE 101 26 083.0.

The advantages achieved by the method according to the invention and the device according to the invention are in particular that fluorescence excitation and measurement can take place on different sides. This results in lower background radiation and thus a higher measuring sensitivity.

1. Method for sequencing nucleic acids comprising the steps: (a) providing an at least partially optically transparent support with a multitude of nucleic acid molecules immobilized thereon where the nucleic acid molecules carry several fluorescent marker groups, (b) progressive cleavage of individual nucleotide building blocks from the immobilized nucleic acid molecules and (c) simultaneous determination of the base sequence of a plurality of nucleic acid molecules based on the time-dependent change in the fluorescence of the nucleic acid molecules or/and of the cleaved nucleotide building blocks caused by the cleavage of nucleotide building blocks, wherein the fluorescence is produced by beaming light into the support and generating an evanescent excitation field by internal reflection at the support surface in the area of the immobilized nucleic acid molecules.
2. Method as claimed in claim 1, wherein a support made of glass, plastics, quartz or a composite containing one or more of these materials is used.
3. Method as claimed in claim 1, wherein a total internal reflection is generated at the support surface.
4. Method as claimed in claim 1, wherein the nucleic acid molecules are labelled in such a manner that at least 50% of all nucleotide building blocks of one base type carry a fluorescent marker group.
5. Method as claimed in claim 4, wherein essentially all nucleotide building blocks of one base type carry a fluorescent marker group.
6. Method as claimed in claim 1, wherein individual nucleotide building blocks are cleaved by an exonuclease.
7. Method as claimed in claim 6, wherein T7 DNA polymerase, E. coli exonuclease I or E. coli exonuclease III are used.
8. Method as claimed in claim 1, wherein the evanescent field is generated by beaming in light using a widened laser.
9. Method as claimed in claim 1, wherein the evanescent field is generated at multiple areas on the support by irradiating it with multiple laser beams or/and by means of multiple internal reflections.
10. Method as claimed in claim 1, wherein the determination of the base sequence comprises a detection of the fluorescence emission of a plurality of nucleic acid strands by a detection matrix.
11. Method as claimed in claim 10, wherein a CCD camera or an avalanche photodiode matrix is used.
12. Method as claimed in claim 1, wherein the fluorescence excitation and detection of all examined nucleic acid strands is carried out in parallel.
13. Method as claimed in claim 1, wherein the fluorescence excitation and detection is carried out in several steps and in each case on a portion of the nucleic acid strands to be examined.
14. Method as claimed in claim 1, wherein a convection flow away from the support is generated during the determination.
15. Method as claimed in claim 1, wherein the fluorescent marker groups are at least partially quenched when they are incorporated into the nucleic acid strands and the fluorescence intensity is increased after cleavage.
16. Device for sequencing nucleic acids comprising: (a) an at least partially optically transparent support comprising a multitude of nucleic acid molecules immobilized thereon where the nucleic acid molecules are present in a single-stranded form and carry several fluorescent marker groups, (b) a reaction space for the progressive cleavage of individual nucleotide building blocks from the immobilized nucleic acid molecules, (c) means for exciting fluorescence by beaming light into the support and generating an evanescent excitation field by internal reflection at the support surface in the area of the immobilized nucleic acid molecules and (d) means for simultaneously determining the base sequence of a plurality of nucleic acid molecules based on the time-dependent change in fluorescence of the nucleic acid molecules or/and of the cleaved nucleotide building blocks caused by cleavage of nucleotide building blocks.
17. (canceled)